ABSTRACT

AvrBs3, the founding member of the Xanthomonas
transcription-activator-like effectors (TALEs), is
translocated into the plant cell where it localizes to
the nucleus and acts as a transcription factor. The
DNA-binding domain of AvrBs3 consists of 17.5
nearly identical 34-amino-acid repeats. Each repeat
specifies binding to one base in the target DNA via
amino acid residues 12 and 13, termed the repeat
variable diresidue (RVD). Natural target sequences of
TALEs are generally preceded by a thymine (T0),
which is coordinated by a tryptophan residue (W232)
in a degenerated repeat upstream of the canonical
repeats. To investigate the necessity of T0 and the
conserved tryptophan for AvrBs3-mediated gene
activation, we tested TALE mutant derivatives on target
sequences preceded by each of the four possible bases. In
addition, we performed domain swaps with TalC from
a rice-pathogenic Xanthomonas because TalC lacks
the tryptophan residue, and the TalC target sequence
is preceded by cytosine. We show that T0 works best
and that T0 specificity depends on the repeat number
and overall RVD composition. T0 and W232 appear to
be particularly important if the RVD of the first repeat
is HD ('rep1 effect'). Our findings provide novel
insights into the mechanism of T0 recognition by TALE
proteins and are important for TALE-based
biotechnological applications.

INTRODUCTION 

Transcription activator-like effectors (TALEs) are bacterial 
type III effector proteins in plant-pathogenic Xanthomonas 
spp., which act as transcription factors in the plant cell 
(1). AvrBs3, the founding member of the highly conserved 
TALE family, was isolated from the pepper and tomato 
pathogen X. campestris pv. vesicatoria (Xcv) (2). We
previously showed that AvrBs3 is translocated into the plant
cell via the type III secretion system, localizes to the
nucleus and activates UPA (upregulated by AvrBs3) genes,
including the cell size regulator UPA20 and the resistance
gene Bs3 in pepper (3-5). TALE proteins are characterized
by three conserved domains: an N-terminal region
(NTR) which harbors the type III secretion and translocation
signal, a central repeat region of variable length that
has deoxyribonucleic acid (DNA)-binding activity and a C- 
terminal region (CTR) that contains nuclear localization 
signals (NLSs) and an acidic activation domain (AD) (1) 
(Figure 1A). The repeat region determines the specificity 
of a given TALE and represents a novel type of DNA- 
binding domain (4,5). The archetypal TALE, AvrBs3, con- 
tains 17.5 nearly-identical tandem repeats of 34 amino acids 
(aa) which differ mainly at positions 12 and 13, termed the
repeat variable diresidue (RVD). Experimental and computer-based
analyses revealed a 'one repeat to one base pair'
recognition mode of TALEs in which one RVD specifies 
binding to one nucleotide in the target sequence. The most 
common RVDs are HD, NI, NG and NN, which specifi- 
cally bind cytosine, adenine, thymine and guanine/adenine, 
respectively (6,7). Crystal structures of TALEs with and 
without DNA provided insights into the structural basis 
for the TALE-DNA interaction (8-11). The repeat region 
forms a superhelical structure that, if bound to double- 
stranded DNA, is wrapped around the DNA helix track- 
ing along the sense strand. Comparison of DNA-free and 
DNA-bound TALEs revealed a conformational change of 
the protein that is compressed upon DNA-binding (8). Each 
repeat contains two α-helices connected by a loop which
exposes residue 13 (RVD-loop). While only amino acid 13
mediates the specific contact to the matching base, amino
acid 12 has a structural function by contacting the alanine
residue (position 8) and the isoleucine residue (position 9) in
the first helix of the same repeat, which stabilizes the
RVD-loop (8,10,12). The phosphate group of each nucleotide is
coordinated by the residues glycine (positions 14 and 15),
lysine (position 16) and glutamine (position 17) of the
following repeat (oxyanion clip), fixing residue 13 and
facilitating RVD-base specificity by a combination of positive
recognition and negative discrimination (8,12).

Figure 1. Imperfect target sequences of AvrBs3 increase the importance
of T0. (A) Schematic presentation of AvrBs3 and reporter constructs.
Repeats are indicated by ovals [gray: degenerated repeats in the N-terminal
region (NTR)]. T3S signal: type III secretion signal; NLS: nuclear
localization signal; AD: activation domain; EBE: effector binding element
(upper strand); N0: position zero of EBE. Bold letters refer to mismatches in
the EBEAvrBs3. EBEs were fused to the Bs4 minimal promoter (pBs4min)
driving expression of the β-glucuronidase (GUS) reporter gene uidA. (B)
Quantification of AvrBs3 activities for different EBEs. GUS activities were
determined 3 days after Agrobacterium-mediated delivery of effector and
reporter constructs into leaves of Nicotiana benthamiana (see 'Materials
and Methods' section). Asterisks indicate a significant difference in
activity of the same TALE derivative tested with EBE-T0 (Student's t-test;
*P-value < 0.05; **P-value < 0.01; ***P-value < 0.001). Experiments were
performed three times with similar results.
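The 'one repeat to one base pair' recognition mode described in this
section can be illustrated with a minimal sketch. It covers only the four
most common RVDs named in the text; the function names and the four-repeat
example array are hypothetical and are not taken from this study.

```python
# Illustrative only: the four most common RVDs and the bases they bind,
# as stated in the text (HD -> C, NI -> A, NG -> T, NN -> G or A).
TAL_CODE = {
    "HD": "C",
    "NI": "A",
    "NG": "T",
    "NN": "GA",  # NN binds guanine or adenine
}


def predict_ebe(rvds, n0="T"):
    """Return the allowed bases per position, starting with position zero.

    Position zero (n0) precedes the RVD-defined sequence; natural targets
    usually carry a thymine there.
    """
    target = [n0]
    for rvd in rvds:
        if rvd not in TAL_CODE:
            raise KeyError(f"RVD {rvd!r} is not covered by this sketch")
        target.append(TAL_CODE[rvd])
    return target


def matches(rvds, dna, n0="T"):
    """Check base by base whether a DNA string fits the predicted target."""
    target = predict_ebe(rvds, n0)
    return len(dna) == len(target) and all(
        base in allowed for base, allowed in zip(dna.upper(), target)
    )


# Hypothetical four-repeat array, not a real TALE.
example_rvds = ["HD", "NG", "NI", "NN"]
print(predict_ebe(example_rvds))
print(matches(example_rvds, "TCTAG"), matches(example_rvds, "TCTAC"))
```

The sketch treats every position as an all-or-nothing match; it does not
model binding affinity or the tolerance of mismatches reported below.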

In nature, the RVD-defined target sequences (effector
binding elements; EBEs) are typically preceded by a
thymine at position zero (T0), which was shown to be
important for full TALE function (6,13,14). To our knowledge,
TalC from the African X. oryzae pv. oryzae (Xoo)
strain BAI3 is the only TALE for which a natural target
sequence is preceded by a cytosine (C0) in the promoter of
the susceptibility gene OsSweet14 (pOsSweet14) (15).
Structure analyses revealed that the TALE DNA-binding
domain is extended by four degenerated repeats in the NTR,
termed repeat -3, -2, -1 and 0 (9-11). Although a
non-canonical 'repeat zero' was predicted to coordinate
binding to T0 (6), the initial T is coordinated by repeat -1
(10). Intriguingly, repeat -1 forms a loop connecting two
α-helices that is comparable to the RVD-loop of canonical
repeats. Repeat -1 contains a tryptophan residue (W232 in
AvrBs3), which is believed to coordinate T0 by van der Waals
interactions (9,10). By contrast, Stella et al. (11), who
provide a 3D structure of DNA-bound AvrBs3, propose that
R266 in AvrBs3 contacts T0. Notably, both the tryptophan
and arginine residues are conserved in TALE proteins. One
exception is TalC, which harbors a cysteine instead of a
tryptophan (15). Recently, TALE homologs from Ralstonia
solanacearum (RTLs; Ralstonia TALE-like) were described
to function similarly to TALEs from Xanthomonas;
however, RTLs need G0 in the corresponding EBEs (16). The
NTR of RTLs differs from that of TALEs, but structure
prediction suggests similar folding and that RTLs coordinate
G0 with an arginine (16,17).

The discovery of the TALE recognition mode ['TAL
code'; (6)] allows the target prediction of natural TAL
effectors as well as the generation of new DNA-binding
domains with any desired DNA-binding specificity (18).
Besides designing TALEs for gene activation, different
executor domains can be fused to the DNA-binding domain,
e.g. a FokI nuclease. TALEs have therefore become a powerful
tool for biotechnological applications such as genome
editing (18). As the need for T0 limits DNA targeting by
artificial (designer) TALEs (dTALEs), we wondered whether the
specificity for position zero can be changed and how
important the tryptophan at position 232 (W232) is for AvrBs3
activity. We therefore analyzed AvrBs3 derivatives carrying
different amino acid substitutions at position 232 in the
context of different RVDs in repeat 1 with respect to
specificity for the initial nucleotide (N0). Here, we demonstrate
that T0 specificity depends on the number of repeats and the
RVD composition, and that RVD1 affects T0 specificity.

MATERIALS AND METHODS 

Bacterial and plant growth conditions 

Escherichia coli strains were grown at 37°C in lysogenic
broth media (LB; tryptone 10 g/l, yeast extract 5 g/l, 170
mM NaCl, pH 7.0) with selective antibiotics. Agrobacterium
tumefaciens strain GV3101 was grown at 30°C in yeast
extract broth media (YEB; beef extract 5 g/l, bacto yeast
extract 1 g/l, bacto peptone 5 g/l, 15 mM sucrose, 1 M
MgSO4, pH 7.2) with selective antibiotics. Nicotiana
benthamiana plants were grown in the greenhouse (day and
night temperatures of 23°C and 19°C, respectively) with 16
h light and 40-60% humidity.
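The media recipes above mix mass concentrations (g/l) with molar
concentrations (mM). As a convenience, the following sketch converts the
molar entries into g/l. The helper name is hypothetical, and the molar
masses are standard values for anhydrous NaCl and sucrose that are assumed
here, not stated in the methods.

```python
# Standard molar masses in g/mol (assumed values, not from the methods).
MOLAR_MASS_G_PER_MOL = {
    "NaCl": 58.44,
    "sucrose": 342.30,
}


def mm_to_g_per_l(conc_mm, compound):
    """Convert a concentration in mM into g/l for a known compound."""
    return conc_mm / 1000.0 * MOLAR_MASS_G_PER_MOL[compound]


# Molar entries of the LB and YEB recipes given in the text.
print(f"170 mM NaCl    = {mm_to_g_per_l(170, 'NaCl'):.2f} g/l")
print(f"15 mM sucrose  = {mm_to_g_per_l(15, 'sucrose'):.2f} g/l")
```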

Generation of reporter and effector constructs 

An entry clone containing the UPA20 promoter target
sequence of AvrBs3 (pUPA20-EBEAvrBs3) in front of the
tomato Bs4 minimal promoter (pBs4min) (19) was used
as template to generate pUPA20-EBEAvrBs3 and the
optimized EBE (EBEAvrBs3) with varying nucleotides at
positions N0 and N1 (Supporting information). Mutations were
introduced by PCR using oligonucleotides TS23 and
TS24-TS42 (Supporting information, Table S1). Entry clones with
the EBEs of ARTrep18-1, ARTrep18-2 and ARTrep18-3 in
front of pBs4min were used as a template to generate
mutations at position 0 using oligonucleotides TS23 and
TS43-TS48 (Supporting information, Table S1). Inserts of entry
clones were recombined into pGWB3 (20) (GATEWAY®
LR Clonase® II Enzyme mix; Life Technologies), leading
to AvrBs3-inducible β-glucuronidase (GUS) reporter
constructs. The coding sequence (CDS) of avrBs3 (accession
number: X16130) was cloned by 'Golden Gate' cloning
as reported (21). The avrBs3 CDS was divided into three
modules (NTR, repeat region and CTR; Supporting
information) which were flanked by BsaI sites and cloned into
pJET (Thermo Scientific). This allows assembly of single
modules into a compatible destination vector (21). Point
mutations in the avrBs3 NTR were introduced by PCR using
oligonucleotides (TS1 + TS2-TS18; TS19 + TS20;
TS21 + TS22) (Supporting information, Table S1). Changes
in AvrBs3 RVD composition and the generation of artificial
repeat regions were accomplished using a TALE-repeat
library based on hax3 (Supporting information) (21). The
avrBs3 sub-modules and repeat regions were assembled as
N-terminal c-Myc fusions into the binary vector pGGA8
(S. Thieme unpublished; Supporting information), allowing
expression of avrBs3 or artrep18 constructs under control of
the constitutive cauliflower mosaic virus 35S promoter
(effector construct). A. tumefaciens strain GV3101 was
transformed with reporter or effector constructs by
electroporation.

Figure 2. W232 is necessary for full AvrBs3 activity. (A) Schematic
presentation of AvrBs3 and reporter constructs. The amino acid sequence of
repeat -1 is given; residue W232 is highlighted. (B) Relative GUS activity
(%) induced by AvrBs3 and W232 mutants. Reporter constructs differed
at position N0. AvrBs3(WT) activity with EBE(T0) was set to 100%.
Standard deviation is based on the mean of three independent experiments.
Color scale: GUS activities smaller than 100%.
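The modules used for 'Golden Gate' cloning are flanked by BsaI sites.
BsaI is a type IIS enzyme that recognizes GGTCTC and cuts outside its
recognition site, one nucleotide downstream on the top strand and five on
the bottom strand, which leaves a 4-nt overhang. The following sketch
locates such sites and reports the overhang in top-strand orientation. The
function name and the two test sequences are hypothetical and do not
correspond to the constructs of this study.

```python
def find_bsai_sites(seq):
    """Locate BsaI sites and the 4-nt overhang each cut would produce.

    Returns a sorted list of (position, strand, overhang) tuples, where
    position is the 0-based start of the recognition site on the top
    strand and the overhang is written in top-strand orientation.
    Sites too close to the sequence end to yield an overhang are skipped.
    """
    seq = seq.upper()
    hits = []
    # "GGTCTC" is the site on the top strand; "GAGACC" is how a site on
    # the bottom strand appears when read on the top strand.
    for strand, motif in (("+", "GGTCTC"), ("-", "GAGACC")):
        start = seq.find(motif)
        while start != -1:
            if strand == "+":
                lo, hi = start + 7, start + 11  # GGTCTC N ^ NNNN
            else:
                lo, hi = start - 5, start - 1   # NNNN ^ N GAGACC
            if lo >= 0 and hi <= len(seq):
                hits.append((start, strand, seq[lo:hi]))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical sequences for illustration only.
print(find_bsai_sites("AAGGTCTCATGCCTTT"))
print(find_bsai_sites("TTTGGCATGAGACCAA"))
```

In a Golden Gate assembly, modules are joined in the order dictated by
matching overhangs, so listing them is a quick plausibility check of a
module design.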
Figure 3. Repeat 1 of AvrBs3 cooperates with degenerated repeats of
the NTR. (A and B) Relative GUS activities (%) induced by AvrBs3 and
derivatives 3 days after Agrobacterium-mediated delivery of effector and
reporter constructs into leaves of Nicotiana benthamiana. AvrBs3(WT)
activity with EBE(T0) was set to 100%. Asterisks indicate a significant
difference in activity of the same TALE derivative tested with EBE-T0
(Student's t-test; *P-value < 0.05; **P-value < 0.01; ***P-value < 0.001).
Experiments were performed three times with similar results.
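The asterisk convention used in the figure legends can be sketched as
follows. The example computes a pooled-variance Student's t statistic and
compares it with two-tailed critical values. It assumes three replicates
per group (4 degrees of freedom), which is an assumption made for this
illustration; the function names and the measurement values are
hypothetical.

```python
import math
import statistics

# Two-tailed critical t values for 4 degrees of freedom (3 + 3 replicates),
# ordered from the strictest to the most lenient threshold.
CRITICAL_T_DF4 = [(8.610, "***"), (4.604, "**"), (2.776, "*")]


def students_t(a, b):
    """Pooled-variance (equal variance) Student's t statistic."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def stars_df4(a, b):
    """Return '*', '**', '***' or '' for two groups of three replicates."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("this sketch only tabulates critical values for n = 3")
    t = abs(students_t(a, b))
    for critical, label in CRITICAL_T_DF4:
        if t > critical:
            return label
    return ""


# Hypothetical GUS activities: reference EBE-T0 versus another N0 reporter.
print(stars_df4([100, 110, 90], [50, 55, 45]))
```

Groups with zero variance in both samples would cause a division by zero
and are not handled in this sketch.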



AvrBs3-activity assay 

Transient GUS reporter assays were performed as described
(4). Agrobacterium strains carrying an effector or a reporter
construct, respectively, were resuspended in Agrobacterium
infiltration media (10 mM MES, 10 mM MgCl2, 150 µM
acetosyringone) to an optical density of 0.8 and mixed in a 1:1
ratio. Agrobacterium mixtures were inoculated into leaves
of four- to seven-week-old N. benthamiana plants using a
needleless syringe. Two to three days post inoculation (dpi),
two leaf discs (diameter 0.9 cm) of three plants were
harvested and used for quantitative GUS activity assays (4).
Green fluorescent protein (GFP) and 35S:GUS (35S) served
as negative and positive controls, respectively. Error bars are
based on the standard deviation from three technical
replicates. Experiments were performed three times with similar
results.

Nucleic Acids Research, 2014, Vol 42, No. 11 7163
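Several figures report GUS activity relative to wild-type AvrBs3 on the
EBE(T0) reporter, which is set to 100%. The following sketch shows one way
to perform such a normalization on replicate measurements. The function
name, the dictionary layout and all numbers are hypothetical, and the
exact procedure used for the figures may differ.

```python
import statistics


def relative_activity(raw, reference_key):
    """Scale replicate values so that the reference mean equals 100%.

    raw maps a (TALE, reporter) key to a list of replicate measurements.
    Returns a dict mapping each key to (mean %, standard deviation %).
    """
    reference_mean = statistics.mean(raw[reference_key])
    result = {}
    for key, values in raw.items():
        scaled = [100.0 * value / reference_mean for value in values]
        result[key] = (statistics.mean(scaled), statistics.stdev(scaled))
    return result


# Hypothetical raw GUS activities from three replicates each.
raw = {
    ("WT", "T0"): [40.0, 50.0, 60.0],
    ("WT", "G0"): [10.0, 12.5, 15.0],
}
for key, (mean, sd) in relative_activity(raw, ("WT", "T0")).items():
    print(key, f"{mean:.1f}% +/- {sd:.1f}%")
```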

Analysis of protein expression 

Two to three days post infection (dpi) three leaf discs were
harvested and ground by TissueLyser (Qiagen). Protein
extracts were mixed with 100 µl 4× Laemmli buffer [250 mM
Tris-HCl (pH 6.8), 8% sodium dodecyl sulphate (SDS), 40%
glycerol, 10% β-mercaptoethanol] and boiled for 5 min.
Protein samples were separated by 8% SDS-polyacrylamide
gel electrophoresis and transferred to nitrocellulose.
c-Myc-tagged proteins were detected using a polyclonal
c-Myc-specific antibody (Santa Cruz). ECL™ Anti-Rabbit IgG
(GE Healthcare) was used for detection by enhanced
chemiluminescence.

RESULTS 

Imperfect target sequences increase the importance of T0

Permutation of the AvrBs3-targeted UPA box in the Bs3
promoter revealed that a thymine at position zero (T0) is
essential for the AvrBs3-induced hypersensitive response (13).
However, Bs3 promoter activation was not quantified.
Interestingly, the UPA box consensus contains a mismatch at
position 1, which is bound by the first RVD (HD1), i.e.
adenine instead of cytosine (19). Here, we investigated the
effects of mismatches in the AvrBs3 target box and
quantified promoter activation depending on the nucleotide
at position zero (N0). For this, we used the established
reporter system consisting of the Bs4 minimal promoter
(pBs4min) preceded by the AvrBs3-effector binding element
(EBEAvrBs3) driving expression of GUS (6). AvrBs3 and
GFP (negative control) were expressed as N-terminal c-Myc
fusions under control of the strong and constitutive 35S
promoter. Both the reporter and effector (or GFP)
expression constructs were delivered by A. tumefaciens into leaves
of N. benthamiana (Figure 1A). We generated four reporter
constructs differing at N0 containing (i) the UPA20-derived
EBEAvrBs3 (UPA20-EBEAvrBs3) and (ii) the optimized,
RVD-defined EBEAvrBs3, respectively. As shown in Figure 1, we
confirmed the importance of T0 for activation by AvrBs3
with the hierarchy T0 > C0 > A0 > G0. The comparison
between the different EBEs suggests that in case of imperfect
target sequences the importance of T0 increases and that
all nucleotides at position zero work better in the optimal
EBEAvrBs3 (Figure 1B). As shown in Supplementary Figure
S1, AvrBs3 and GFP were stably expressed. To exclude side
effects due to mismatches we used the optimal EBEAvrBs3 in
all following experiments.

Analysis of AvrBs3 tryptophan (W232) mutants 

Structural data revealed that a tryptophan residue located
in the 'RVD-loop' of repeat -1 is the most proximal amino
acid to T0 in the target DNA. The tryptophan is believed to
interact with the base by van der Waals forces (10). Both the
tryptophan residue and T0 are highly conserved in natural
TALEs and target sequences, respectively (18). To
investigate the importance of tryptophan at position 232 (W232)
in AvrBs3 and to identify amino acids that broaden or
change target specificity for N0, we generated avrBs3
mutant derivatives. The activity of AvrBs3 and derivatives was
determined using the GUS reporter system containing the
optimal EBEAvrBs3 (Figure 2A). Figure 2B shows that most
amino acid substitutions in AvrBs3 led to drastically
reduced activity. However, substitutions of W232 by the
aromatic amino acids tyrosine (W232Y) and phenylalanine
(W232F) retained the highest activity (~70 and 50%)
compared to the wild-type (WT) protein and, like WT AvrBs3,
worked best with T0 (Figure 2B). Expression of all
proteins was confirmed by immunoblot (Supplementary
Figure S2). Together, these results confirm the crucial
importance of W232 in AvrBs3. No AvrBs3 derivative
with a single substitution performed significantly better
than the WT with any nucleotide at position zero (N0). Only
AvrBs3(W232R) showed slightly increased activity in
combination with G0.

Recently, a 3D structure of DNA-bound AvrBs3 was
described (11). Notably, comparison of the structure of the
NTR to previously published structures suggests
different residues to be crucial for the coordination of T0. In
AvrBs3, T0 is supposed to be coordinated by arginine 266
(R266) in repeat 0, with the participation of R236 in repeat
-1 (Supplementary Figure S3A) (11). We therefore
substituted R266 in AvrBs3 by glycine and found slightly
reduced activity, but specificity for T0 was comparable to WT
AvrBs3 (Supplementary Figure S3B). The AvrBs3
derivative R236G displayed only low activity with the T0 EBE,
possibly due to very low protein expression levels, which were
below the detection limit (Supplementary Figure S3B).

The degenerated repeats cooperate with repeat 1 

The tryptophan (W232) that coordinates TALE contact to
the base at position N0 is not conserved in TalC from the
rice pathogen X. oryzae pv. oryzae (15). Instead, TalC
contains a cysteine residue, which when introduced into AvrBs3
led to low activity [AvrBs3(W232C); Figure 2B]. Notably,
the natural target box of TalC starts with C0 (15). TalC
harbors additional substitutions and a deletion of 23 aa
in the NTR (Supplementary Figure S4A). To test whether
the NTR of TalC confers a preference for an EBE with C0
we compared the activities of AvrBs3, AvrBs3(W232C) and
chimeras between TalC and AvrBs3 (Figure 3A). As
targets we used the same four different EBEAvrBs3 reporters as
in Figure 2. Surprisingly, the swap of the NTRs resulted
in a non-functional AvrBs3 protein. Sequence comparison
(Supplementary Figure S4A) revealed an amino acid
difference in TalC repeat 0, which according to 3D data (9) is
located in an α-helical region that is tightly packed together
with the neighboring helix of the canonical repeat 1. We
therefore tested chimeras between TalC-NTR and AvrBs3
in which we shortened the fragment contributed by TalC.
As shown in Figure 3A, AvrBs3 activity improved when the
protein contained only the very N-terminal part of the TalC
NTR including repeat -2 [AvrBs3-N2(TalC)]. However, AvrBs3
containing only repeat -1 and repeat 0 from TalC displayed very
low activity [AvrBs3-N3(TalC)]. Only the exchange of repeat 0
was tolerated [AvrBs3-N5(TalC)] but led to reduced AvrBs3
activity (Figure 3A). This confirms our hypothesis that amino
acid differences in the helices of repeat 0 and repeat 1
affect protein activity. Furthermore, the results underpin the
necessity of W232 for AvrBs3 function.

Figure 4. The RVD in repeat 1 of AvrBs3 affects specificity for T0 in the DNA target sequence. (A) Schematic presentation of AvrBs3 derivatives and
DNA EBEs. AvrBs3 derivatives contain different RVDs in repeat 1 (black oval; RVD = XX) and were analyzed with corresponding EBEs (N1) that vary
in position zero (N0). (B) Schematic presentation of the analyzed AvrBs3-EBE combinations. (C and D) GUS activities induced by AvrBs3 and derivatives
3 days after Agrobacterium-mediated delivery of effector and reporter constructs into leaves of Nicotiana benthamiana. Please note that inoculations and
tissue harvest for (C) and (D) were performed together for all samples of this experiment, but for technical reasons GUS activities were determined on
different days. Asterisks indicate a significant difference in activity of the same TALE derivative tested with EBE-T0 (Student's t-test; *P-value < 0.05;
**P-value < 0.01; ***P-value < 0.001). Experiments were performed twice with similar results.

Next, we reasoned that the degenerated repeats (repeat
-3 to 0) might cooperate with repeat 1, which differs
between TalC (NS1) and AvrBs3 (HD1) (Supplementary
Figure S4B). We therefore generated designer AvrBs3
constructs [dAvrBs3; WT, W232C, NTR(TalC)] differing in the
RVD of repeat 1 (RVD1; HD1 to NS1) (Figure 3B).
Interestingly, NS1 led to very good activity of AvrBs3
irrespective of N0 in EBEAvrBs3 and of the presence of tryptophan,
cysteine or alanine at position 232 and glycine at position 236
(Figure 3B; Supplementary Figure S5). Expression of all
AvrBs3-TalC chimeras in planta was shown by western blot
(Supplementary Figure S6). In conclusion, NS1 broadens
the target specificity of AvrBs3 for all four bases at N0 and
tolerates mutations in the NTR.

Activity of AvrBs3 with an HD-repeat 1 depends on T0

To test whether other RVDs in repeat 1 of AvrBs3
behave similarly to NS1 we replaced HD1 with commonly
used RVDs (NK, NH, NN, NG, NI). AvrBs3 activity was
assessed using reporters with EBEs based on the
optimal recognition specificity of the chosen RVDs (Figure 4A
and B). Surprisingly, all analyzed RVDs resulted in good
AvrBs3 activity irrespective of N0 (Figure 4C and D),
suggesting that T0 is particularly important if the repeat
region starts with HD1. We termed this the 'rep1 effect'.
In addition, AvrBs3 constructs containing NH1, NN1 and
NS1 showed a slight preference for T0, although differences
were not significant. The native AvrBs3 showed an activity
comparable to dAvrBs3-HD1 (Figure 4C and D). Relative
GUS values of all dAvrBs3 derivatives are summarized in
Supplementary Figure S7A, and their expression was
confirmed by immunoblot (Supplementary Figure S7B).
Notably, dAvrBs3(NS1) displayed robust and T0-independent
activity with EBEs A1/C1/G1, whereas NS1 together with
EBE T1 resulted in very low activity with all four reporter
constructs (EBE N0T1) (Figure 4C and D). This is probably
due to the conformation of the serine in contact with thymine,
which is unfavored if present at position 1 (8). Interestingly,
dAvrBs3(HD1) displayed similar activities with the EBEs
C1 and A1, keeping the specificity for the base at position 0
(T0 > C0 > A0 > G0) (Supplementary Figure S7C). This
suggests that adenine at position 1 allows a similar
interaction as cytosine with the RVD HD1.

T0-dependency is affected by repeat number and RVD
composition

Surprisingly, several RVDs in repeat 1 resulted in good
AvrBs3 activity irrespective of N0 (Figure 4C and D). This
result is in contrast to the strong conservation of T0 in
natural target sequences and published data (6,14,22,23).
We reasoned that the T0-dependency might be influenced
by both repeat number and RVD composition. To
address this, we shortened the AvrBs3 repeat region (17.5
repeats) to obtain dAvrBs3 constructs with 13.5 to 9.5
repeats and exchanged HD1 to NS1 because of its broad
recognition specificity with the EBEs N0C1 and N0A1
(Figure 5). Although there is a tendency for T0 preference,
dAvrBs3(NS1)-17.5 displayed no significant difference in
activity with different bases at N0, whereas the activity of
dAvrBs3(HD1)-17.5 depended on N0 with the hierarchy T0
> C0 > A0 > G0 ('rep1 effect'; Figure 5A and B).

Interestingly, reducing the number of repeats increased
T0-dependency in all cases, even if the repeat region starts
with NS1. While AvrBs3 constructs carrying 13.5 and
12.5 repeats displayed higher activity than AvrBs3 with
17.5 repeats, activities of AvrBs3 constructs with 11.5
repeats were comparable and T0-dependent. AvrBs3
constructs (HD1 and NS1) with 10.5 and 9.5 repeats showed
weak and no activity, respectively, if combined with the
EBEs N0C1 (Figure 5A and B). All AvrBs3 derivatives were
stably expressed (Supplementary Figure S8A). Notably,
dAvrBs3(NS1)-11.5 displayed increased activity and
reduced T0-dependency if combined with EBE N0A1 instead
of EBE N0C1 (Figure 5C; Supplementary Figure S8B).
Furthermore, if we compare the results for dAvrBs3(HD1)-11.5
rep with EBEAvrBs3(C1) and dAvrBs3(NS1)-11.5 rep with
EBEAvrBs3(A1), the influence of RVD1 on T0-dependency
becomes obvious ('rep1 effect') (Figure 5C). Altogether, these
data corroborate the need for T0 in case of a short repeat
region.
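The repeat counts used above can be translated into RVD numbers with a
short sketch: a repeat region of n.5 repeats carries n + 1 RVDs, because
the final half repeat also contributes one RVD. How the repeat region was
shortened in this study is described in the Supporting information and is
not reproduced here; the sketch simply assumes that the leading repeats
and the final half repeat are kept. The function name and the 18-RVD
placeholder array are hypothetical.

```python
def shorten_repeat_region(rvds, n_repeats):
    """Return an RVD array with n_repeats repeats (for example 11.5).

    Assumption for this sketch: the leading repeats and the final half
    repeat are kept, and the repeats in between are removed from the end.
    """
    if (n_repeats * 2) % 2 != 1:
        raise ValueError("repeat count must end in .5, e.g. 11.5")
    n_rvds = int(n_repeats + 0.5)  # the half repeat contributes one RVD
    if n_rvds > len(rvds):
        raise ValueError("cannot lengthen the repeat region")
    return rvds[: n_rvds - 1] + rvds[-1:]


# Placeholder array with 18 RVDs (17.5 repeats); not the AvrBs3 sequence.
full_length = ["NS"] + ["HD", "NI", "NG", "NN"] * 4 + ["NG"]
for n in (17.5, 13.5, 12.5, 11.5, 10.5, 9.5):
    print(n, len(shorten_repeat_region(full_length, n)))
```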



Next, we investigated whether the T0-dependency and
'rep1 effect' are influenced by the RVD composition. For
this, we constructed six artificial TALEs consisting of 17.5
repeats that differ in the RVD composition (ARTrep18-1,
ARTrep18-2 and ARTrep18-3) and RVD1 (HD1 or NS1;
Figure 6). Activities of ARTrep18-1 and ARTrep18-2 with
the corresponding EBEs (T0) were comparable, whereas
ARTrep18-3 displayed reduced activity (Figure 6A). It
appears that a highly active TALE is less dependent on T0 (e.g.
ARTrep18-2). TALEs with HD1 and EBE G0 showed the
lowest activity in each case (Figure 6B). Expression of all
constructs was shown by western blot (Figure S9C).

Taken together, the results show a clear hierarchy of
factors governing the T0-dependency of TALEs: repeat number,
the RVD composition and the 'rep1 effect' (compare Figures 5
and 6).

DISCUSSION 

T0-dependency and 'rep1 effect'

TALE-derived DNA-binding domains serve as powerful
tools to direct executor domains to desired target sequences.
A considerable constraint is the dependency on a thymine at
position zero of the EBE (T0). To be able to change the
specificity for T0 one needs to understand the molecular
mechanism of T0 coordination. Here, this question was addressed
by studying derivatives of AvrBs3 and artificial TALEs. Our
data show that the T0-dependency of TALEs is affected
by the overall RVD composition and increases with fewer
repeats and if the EBE sequence contains mismatches. Finally,
we discovered that repeat 1 cooperates with the degenerated
repeats and that the RVD in repeat 1 affects the nucleotide
specificity for T0. T0 appears to be particularly important
if the RVD of the first repeat is HD ('rep1 effect'; please
see the statistical evaluation in Supplementary Figure S9).
We think that the T0-dependency decreases in case of high
DNA-binding affinity provided by a well-balanced RVD
composition of the canonical repeats. It was recently shown
that the DNA-binding affinity of a given TALE depends on
the overall RVD composition (24).

Natural TALE EBEs almost always start with T0 and, to
our knowledge, never correspond to the TAL code-deduced
optimal DNA sequence. Furthermore, the amount of
TALE proteins secreted into the plant cell by Xanthomonas
is much lower than the amount of TALE molecules
produced by 35S-driven expression in the eukaryotic cell. We
therefore believe that in nature TALE activity requires T0
in the corresponding EBEs. Nevertheless, a natural TALE
consisting of at least 17.5 repeats with a well-balanced
RVD composition might induce target genes independent
of T0, as exemplified by TalC (15). Hence, we suggest
considering the T0-dependency and 'rep1 effect' in off-target
predictions for artificial TALEs and TALENs, which are
usually designed to match perfectly to the target sequence.
Taken together, our results suggest the following hierarchy
for T0-independent TALEs: first, repeat number (17.5),
followed by the RVD composition (well-balanced) and finally
the 'rep1 effect' (no HD1) (compare Figures 5 and 6).
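The qualitative hierarchy above can be turned into a simple checklist for
designer TALEs. The sketch below is only a rule-of-thumb illustration and
not a predictive model: the function name is hypothetical, the threshold
of 17.5 repeats is taken from the statement above, and the RVD composition
is left out because 'well-balanced' is not quantified in the text.

```python
def t0_dependency_flags(n_repeats, first_rvd, perfect_match=True):
    """List qualitative reasons to expect a strong need for T0.

    The flags follow the order of the hierarchy discussed in the text.
    RVD composition is deliberately omitted (no quantitative criterion).
    """
    flags = []
    if n_repeats < 17.5:
        flags.append("fewer than 17.5 repeats")
    if not perfect_match:
        flags.append("EBE contains mismatches")
    if first_rvd == "HD":
        flags.append("HD in repeat 1 ('rep1 effect')")
    return flags


# Hypothetical designs for illustration.
print(t0_dependency_flags(17.5, "NS"))
print(t0_dependency_flags(11.5, "HD"))
```

An empty list does not imply T0-independence; it only means that none of
the listed factors applies.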



Generation of T0-independent NTRs

Structural data suggested that the tryptophan in repeat -1
(W232) of PthXo1 coordinates T0 (10). We therefore
analyzed whether the T0 specificity of AvrBs3 can be changed
by W232 substitutions. Although AvrBs3 derivatives with
W232 substitutions to aromatic side chains retained
considerable activity, they were less active than the WT.
Obviously, T0 specificity could not easily be changed by single
amino acid substitutions, suggesting that T0 coordination is
more complex and involves additional residues. The need for
an aromatic side chain at position 232 in AvrBs3 and for T0
appears to depend on the RVD HD1. This hypothesis is
supported by (i) the natural TALE TalC, which contains a
tryptophan to cysteine substitution in repeat -1 and a first
canonical repeat with the RVD NS1 and by (ii) the functionality
of W232C and W232A mutations in AvrBs3 if the repeat
region starts with NS1. Our data provide additional
explanations to previous studies, in which W232 of TALEs was
mutated (17,25). Notably, we confirmed data obtained by
Tsuji et al. (25) for a dTALE that starts with HD1 and
contains 14.5 repeats. In contrast, Doyle et al. (17) obtained
variable results for W232 substitutions in PthXo1 (24.3
repeats; NN1; EBE with four mismatches) and dTALE868
(14.5 repeats; NI1; optimal EBE). In the latter case,
however, the results were reported to be highly variable (17).
Recently, T0-independent NTRs were generated by
mutation of the 'RVD-loop' of repeat -1 (22,25). Notably, a
G0-specific NTR was generated by the double amino acid
substitution W232R/Q231S in the dTALE Avr15 (14.5 repeats;
NI1; optimal EBE) (22). In our study, TALEs containing
HD1 always worked less well if combined with an EBE that
starts with G0. Interestingly, A0-specific and C0-specific but
not G0-specific NTRs combined with a repeat region that
starts with HD1 were described (25). This might be due to
the fact that G0 in EBEs of TALEs is unfavored if the first
RVD of the canonical repeats is HD1 (as seen with AvrBs3
and the ARTrep18 constructs). One of the novel findings of
our study is that a desired change in T0 specificity by
mutagenesis of the TALE NTR needs to consider different RVDs
in repeat 1.

W232 and T0 facilitate interaction of the canonical repeats
with the target sequence

Previously, it was suggested that the NTRs of TALEs serve
as nucleation site for the DNA interaction (9,24). To
integrate our data we propose the following model (Figure 7).
TALEs may slide along the DNA scanning for the target
sequence. Once the target sequence is reached, specific
contacts between the canonical repeats and nucleobases occur
from 5' to 3' and allow the repeats to compress (8,9,24).
Taking this idea into account we hypothesize that the
W232-T0 interaction facilitates the specific interaction between
canonical repeats and target nucleobases, which may be
more crucial if an HD1 contact to C1 needs to be established
(Figure 7). We can only speculate about the strong
dependency of TALEs with HD1 on T0 and W232. One
explanation could be that the amine group of cytosine targeted
by HD is more distant from the sugar phosphate backbone
than N7 of guanine and adenine or the methyl group of
thymine targeted by other RVDs (Supplementary Figure
S10). In addition, HD is the only RVD which accepts the
hydrogen bond from the base, whereas others donate
hydrogen bonds to the base or interact with the base via van
der Waals forces (Supplementary Figure S10) (8,10,26).

Based on TALE structures the nucleobase at position
zero of a corresponding EBE is contacted by the repeats -1,
0 and 1. Repeat -1 was reported to interact with the base
by van der Waals forces (W232-T0), whereas repeats 0 and
1 coordinate the phosphate group of T0 via direct (oxyanion
clip) and water-mediated hydrogen bonds (8,10-12). In
contrast to Mak et al. (10), a recently published structure of
DNA-bound AvrBs3 suggests an alternative conformation
of repeat -1 and 0, in which T0 is coordinated by R266
and not by W232 (11). Notably, the DNA fragment used
for crystallization of DNA-bound AvrBs3 started at
position -2, i.e. it did not allow the interaction of the complete
NTR with the DNA (11). The latter, however, may stabilize
the conformation of the NTR in a DNA-bound TALE (10).
In both TALE-DNA complexes repeats -2 and -3 are
disordered. Mutation of R266 in AvrBs3 slightly reduced the
overall AvrBs3 activity but did not alter T0 specificity. We
therefore believe that the suggested role of R266 in T0
coordination (11) is unlikely. This is corroborated by the
fact that changes in T0 specificity were accomplished by
mutation of W232 and neighboring residues (22,25).
Considering a conformational switch of TALEs upon target
binding one cannot exclude that the structure reported by Stella
et al. (11) may represent a stable state conformation of a
DNA-bound TALE. Compression of the canonical repeats
upon specific target binding may trigger a conformational
switch of the NTR. In this case, W232-T0 interaction might
be relevant for TALE target finding and the conformational
switch initiation (Figure 7). Notably, TALE flexibility was
underpinned by molecular dynamics simulation (12,27). To
reveal the details of TALE-DNA interactions further
structural analyses of TALEs with T0-independent NTRs and
TALE-DNA complexes are required.

The data presented here provide explanations for
reported variations in T0 specificity of different TALEs and
give novel insights into the mechanism of T0 recognition by
TALE proteins. Our findings will improve the design of
customized TALE-based DNA-binding proteins, generation of
T0-independent NTRs, target prediction and off-target
prevention.